fix: validate reviewAgentLogPath to prevent path injection#1134
fix: validate reviewAgentLogPath to prevent path injection#1134
Conversation
Add path validation to invokeDiffReviewLlm to ensure the log path stays within the expected DATA_CACHE_DIR/review-agent directory. This addresses CodeQL alert #19 (js/path-injection) by resolving the path and verifying it does not escape the log directory. The validation is performed before each fs.appendFileSync call to prevent path traversal attacks even if the call chain changes in the future. Co-authored-by: Michael Sukkarieh <msukkari@users.noreply.github.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
✅ Files skipped from review due to trivial changes (1)
WalkthroughLog path handling for the review agent was hardened: log directory resolution was centralized via Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Proc as processGitHubPullRequest
participant App as app.ts
participant Agent as invokeDiffReviewLlm
participant FS as Filesystem
participant LLM as OpenAI
Proc->>App: start processing PR
App->>Agent: call process with PR data
Agent->>Agent: dir = getReviewAgentLogDir()
Agent->>FS: resolve(proposedLogPath)
Agent->>Agent: validateLogPath(resolvedPath, dir)
Agent->>FS: write prompt file (validated path)
Agent->>LLM: send prompt -> await response
LLM-->>Agent: response
Agent->>FS: resolve(responseLogPath)
Agent->>Agent: validateLogPath(resolvedResponsePath, dir)
Agent->>FS: write response file (validated path)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Co-authored-by: Michael Sukkarieh <msukkari@users.noreply.github.com>
…keDiffReviewLlm.ts Export the log directory constant from invokeDiffReviewLlm.ts and import it in app.ts to ensure a single source of truth for the review agent log directory path. Co-authored-by: Michael Sukkarieh <msukkari@users.noreply.github.com>
Co-authored-by: Michael Sukkarieh <msukkari@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
packages/web/src/features/agents/review-agent/nodes/invokeDiffReviewLlm.ts (1)
31-48: Minor: redundant double validation.
reviewAgentLogPathis a function parameter (a string) that doesn't change between the twovalidateLogPathcalls, so the second invocation on Line 46 is redundant. Validating once near the top ofinvokeDiffReviewLlm(whenreviewAgentLogPathis defined) would be simpler and still safe — there's no TOCTOU benefit here since the check is purely string-based.♻️ Proposed refactor
if (!env.OPENAI_API_KEY) { logger.error("OPENAI_API_KEY is not set, skipping review agent"); throw new Error("OPENAI_API_KEY is not set, skipping review agent"); } + + if (reviewAgentLogPath) { + validateLogPath(reviewAgentLogPath); + } const openai = new OpenAI({ apiKey: env.OPENAI_API_KEY, }); if (reviewAgentLogPath) { - validateLogPath(reviewAgentLogPath); fs.appendFileSync(reviewAgentLogPath, `\n\nPrompt:\n${prompt}`); } @@ if (reviewAgentLogPath) { - validateLogPath(reviewAgentLogPath); fs.appendFileSync(reviewAgentLogPath, `\n\nResponse:\n${openaiResponse}`); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/web/src/features/agents/review-agent/nodes/invokeDiffReviewLlm.ts` around lines 31 - 48, The code calls validateLogPath(reviewAgentLogPath) twice inside invokeDiffReviewLlm — once before appending the prompt and again before appending the response — which is redundant because reviewAgentLogPath is an immutable parameter; remove the second call. Update invokeDiffReviewLlm to validate reviewAgentLogPath once (when reviewAgentLogPath is truthy) before the first fs.appendFileSync, and keep the subsequent fs.appendFileSync(reviewAgentLogPath, `\n\nResponse:\n${openaiResponse}`) without re-validating; reference validateLogPath, reviewAgentLogPath, invokeDiffReviewLlm, and fs.appendFileSync to locate the lines to change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/web/src/features/agents/review-agent/nodes/invokeDiffReviewLlm.ts`:
- Around line 10-17: Make REVIEW_AGENT_LOG_DIR an absolute canonical path by
using path.resolve on env.DATA_CACHE_DIR (e.g., REVIEW_AGENT_LOG_DIR =
path.resolve(env.DATA_CACHE_DIR, 'review-agent')), and update validateLogPath to
canonicalize both the base and the incoming logPath (use path.resolve) then
compute const rel = path.relative(REVIEW_AGENT_LOG_DIR, resolved); reject the
path when rel is empty, is absolute (path.isAbsolute(rel)), or startsWith('..')
and throw the existing error; reference symbols: REVIEW_AGENT_LOG_DIR and
validateLogPath.
---
Nitpick comments:
In `@packages/web/src/features/agents/review-agent/nodes/invokeDiffReviewLlm.ts`:
- Around line 31-48: The code calls validateLogPath(reviewAgentLogPath) twice
inside invokeDiffReviewLlm — once before appending the prompt and again before
appending the response — which is redundant because reviewAgentLogPath is an
immutable parameter; remove the second call. Update invokeDiffReviewLlm to
validate reviewAgentLogPath once (when reviewAgentLogPath is truthy) before the
first fs.appendFileSync, and keep the subsequent
fs.appendFileSync(reviewAgentLogPath, `\n\nResponse:\n${openaiResponse}`)
without re-validating; reference validateLogPath, reviewAgentLogPath,
invokeDiffReviewLlm, and fs.appendFileSync to locate the lines to change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d34658fd-f05a-4b07-abcf-0440067a55bf
📒 Files selected for processing (3)
CHANGELOG.mdpackages/web/src/features/agents/review-agent/app.tspackages/web/src/features/agents/review-agent/nodes/invokeDiffReviewLlm.ts
|
test |
Change REVIEW_AGENT_LOG_DIR from a top-level constant to a getReviewAgentLogDir() function to avoid evaluating env.DATA_CACHE_DIR at module load time, which fails during Next.js build when environment variables are not yet available. Co-authored-by: Michael Sukkarieh <msukkari@users.noreply.github.com>
Co-authored-by: Michael Sukkarieh <msukkari@users.noreply.github.com>
This reverts commit 476081e.
* feat(web): GitLab MR Review Agent
Adds support for the AI Review Agent to review GitLab Merge Requests, mirroring the existing GitHub PR review functionality. Also fixes several bugs discovered during implementation and improves the shared review pipeline.
---
## New files
### `packages/web/src/features/agents/review-agent/nodes/gitlabMrParser.ts`
Parses a GitLab MR webhook payload into the shared `sourcebot_pr_payload` format. Calls `MergeRequests.show()` and `MergeRequests.allDiffs()` in parallel — the API response is used for `title`, `description`, `sha`, and `diff_refs` (which can be absent in webhook payloads for `update` action events), while per-file diffs are parsed using the existing `parse-diff` library.
### `packages/web/src/features/agents/review-agent/nodes/gitlabPushMrReviews.ts`
Posts review comments back to GitLab using `MergeRequestDiscussions.create()` with a position object carrying `base_sha`, `head_sha`, and `start_sha`. Falls back to `MergeRequestNotes.create()` (a general MR note) if the inline comment is rejected by the API (e.g. the line is not within the diff), ensuring reviews are always surfaced even when precise positioning fails.
### Test files (4 new files, 34 tests total)
- `githubPrParser.test.ts` — diff parsing and metadata mapping for the GitHub parser
- `githubPushPrReviews.test.ts` — single-line vs multi-line comment parameters, error resilience
- `gitlabMrParser.test.ts` — API call arguments, metadata mapping, diff parsing, edge cases (empty diffs, nested groups, null description, API failures)
- `gitlabPushMrReviews.test.ts` — inline comment posting, fallback behaviour, missing `diff_refs` guard, multi-file iteration
---
## Modified files
### `packages/web/src/features/agents/review-agent/types.ts`
- Added `sourcebot_diff_refs` schema/type (`base_sha`, `head_sha`, `start_sha`) and an optional `diff_refs` field on `sourcebot_pr_payload`
- Added `GitLabMergeRequestPayload` and `GitLabNotePayload` interfaces for webhook event typing
### `packages/web/src/features/agents/review-agent/app.ts`
- Added `processGitLabMergeRequest()` function mirroring `processGitHubPullRequest()`: sets up logging, runs the GitLab parser, generates reviews via the shared LLM pipeline, and pushes results
- Removed stale `OPENAI_API_KEY` guards (model availability is now enforced inside `invokeDiffReviewLlm`)
### `packages/web/src/features/agents/review-agent/nodes/invokeDiffReviewLlm.ts`
**Replaces the hardcoded OpenAI client** with the Vercel AI SDK's `generateText` and the shared `getAISDKLanguageModelAndOptions` / `getConfiguredLanguageModels` utilities from `chat/utils.server.ts`. The review agent now uses whichever language model is configured in `config.json`, supporting all providers (Anthropic, Bedrock, Azure, etc.).
- `REVIEW_AGENT_MODEL` env var (matched against `displayName`) selects a specific model when multiple are configured; falls back to `models[0]` with a warning if the name is not found
- Prompt is passed via the `system` parameter with a `"Review the code changes."` user turn, satisfying providers (e.g. Bedrock/Anthropic) that require conversations to begin with a user message
### `packages/web/src/features/agents/review-agent/nodes/fetchFileContent.ts`
**Fixes "Not authenticated" error** when the review agent calls `getFileSource`. The original implementation used `withOptionalAuth`, which reads a session cookie — absent in webhook handlers. Now calls `getFileSourceForRepo` directly with `__unsafePrisma` and the single-tenant org, bypassing the session-based auth layer. The webhook handler has already authenticated the request via its own mechanism (GitHub App signature / GitLab token).
### `packages/web/src/features/git/getFileSourceApi.ts`
- Extracted the core repo-lookup + git + language-detection logic into a new exported `getFileSourceForRepo({ path, repo, ref }, { org, prisma })` function
- `getFileSource` now handles auth and audit logging then delegates to `getFileSourceForRepo` — all existing callers are unchanged
### `packages/web/src/app/api/(server)/webhook/route.ts`
- Added GitLab webhook handling alongside the existing GitHub branch
- Verifies `x-gitlab-token` against `GITLAB_REVIEW_AGENT_WEBHOOK_SECRET`
- Handles `Merge Request Hook` events (auto-review on `open`, `update`, `reopen`) and `Note Hook` events (manual `/review` command on MR comments)
- Initialises a `Gitlab` client at module load if `GITLAB_REVIEW_AGENT_TOKEN` is set
### `packages/web/src/app/(app)/agents/page.tsx`
- Split the single "Review Agent" card into two separate cards: **GitHub Review Agent** and **GitLab Review Agent**, each showing its own configuration status
- Removed `OPENAI_API_KEY` from the GitHub card's required env vars (no longer applicable)
### `packages/web/src/app/(app)/components/navigationMenu/navigationItems.tsx` & `index.tsx`
- Added an **Agents** nav item (with `BotIcon`) between Repositories and Settings
- Visible when the user is authenticated **and** at least one agent is configured (GitHub App triple or GitLab token pair), computed in the server component and passed down as `isAgentsVisible`
### `packages/shared/src/env.server.ts`
Added four new environment variables:
| Variable | Purpose |
|---|---|
| `GITLAB_REVIEW_AGENT_WEBHOOK_SECRET` | Verifies the `x-gitlab-token` header on incoming webhooks |
| `GITLAB_REVIEW_AGENT_TOKEN` | Personal or project access token used for GitLab API calls |
| `GITLAB_REVIEW_AGENT_HOST` | GitLab hostname (defaults to `gitlab.com`; set for self-hosted instances) |
| `REVIEW_AGENT_MODEL` | `displayName` of the configured language model to use for reviews; falls back to the first model if unset or not matched |
### `packages/web/package.json`
Added `@gitbeaker/rest` dependency (already used in `packages/backend`).
---
## Bug fixes
| Bug | Fix |
|---|---|
| `"Not authenticated"` when fetching file content from the review agent | `fetchFileContent` now calls `getFileSourceForRepo` directly instead of `getFileSource` (which gates on session auth) |
| `"diff_refs is missing"` when posting GitLab MR reviews | `gitlabMrParser` now fetches the full MR via `MergeRequests.show()` instead of relying on the webhook payload, which omits `diff_refs` on `update` events |
| Bedrock/Anthropic rejection: `"A conversation must start with a user message"` | `invokeDiffReviewLlm` now passes the prompt via `system` + a `prompt` user turn instead of a `system`-role entry inside `messages` |
| Review agent silently used `models[0]` with no way to specify a different model | New `REVIEW_AGENT_MODEL` env var selects by `displayName` |
* Add CHANGELOG.md entry
* Address coderabbit review comments
* Update `review-agent` docs
* Add `zod` schema to Gitlab events
* Fix `repoName` calculation for Gitlab MRs
* Add type annotation
* Make sure that `GITLAB_REVIEW_AGENT_HOST` is valid hostname
* Address final Coderabbit comments
* Include the security fixes from #1134
---------
Co-authored-by: Gavin Williams <gavin.williams@getchip.uk>
Co-authored-by: Brendan Kellam <brendan@sourcebot.dev>
Fixes SOU-930
Summary
This PR addresses CodeQL alert #19 (js/path-injection) by adding path validation to the
invokeDiffReviewLlmfunction in the review agent.Problem
The
reviewAgentLogPathparameter was passed tofs.appendFileSyncwithout validation, creating a latent path injection vulnerability. While the current code constructs the path safely usingpullRequest.number(which is always an integer from the GitHub API), the function itself did not validate the path, meaning any future changes to the call chain could introduce a security issue.Solution
Added a
validateLogPathhelper function that:DATA_CACHE_DIR/review-agent/)The validation is performed before each
fs.appendFileSynccall to ensure defense-in-depth.Additionally, the log directory constant is now exported from
invokeDiffReviewLlm.tsand shared withapp.tsto ensure a single source of truth.Changes
pathimport toinvokeDiffReviewLlm.tsREVIEW_AGENT_LOG_DIRconstant for the expected log directoryvalidateLogPath()helper function that validates paths are within the expected directoryfs.appendFileSynccalls (prompt and response logging)app.tsto import and useREVIEW_AGENT_LOG_DIRinstead of defining it locallyTesting
References
Linear Issue: SOU-930
Summary by CodeRabbit
Bug Fixes
Documentation